realistic scenario
SIMILAR: Submodular Information Measures Based Active Learning In Realistic Scenarios
Active learning has proven to be useful for minimizing labeling costs by selecting the most informative samples. However, existing active learning methods do not work well in realistic scenarios such as imbalance or rare classes, out-of-distribution data in the unlabeled set, and redundancy. In this work, we propose SIMILAR (Submodular Information Measures based actIve LeARning), a unified active learning framework using recently proposed submodular information measures (SIM) as acquisition functions. We argue that SIMILAR not only works in standard active learning but also easily extends to the realistic settings considered above and acts as a one-stop solution for active learning that is scalable to large real-world datasets. Empirically, we show that SIMILAR significantly outperforms existing active learning algorithms by as much as ~5%-18% in the case of rare classes and ~5%-10% in the case of out-of-distribution data on several image classification tasks like CIFAR-10, MNIST, and ImageNet.
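The abstract above describes selecting informative unlabeled samples by maximizing submodular acquisition functions. A minimal sketch of that idea, assuming a simple facility-location submodular function over cosine similarities and plain greedy selection (SIMILAR itself uses richer submodular *information* measures, not shown here):

```python
import numpy as np

def facility_location_gain(sim, selected, candidate):
    """Marginal gain of adding `candidate` to `selected` under the
    facility-location function f(S) = sum_i max_{j in S} sim[i, j]."""
    if not selected:
        return sim[:, candidate].sum()
    current = sim[:, selected].max(axis=1)
    return np.maximum(current, sim[:, candidate]).sum() - current.sum()

def greedy_select(sim, budget):
    """Pick `budget` points by greedy submodular maximization
    (the standard 1 - 1/e approximation for monotone submodular f)."""
    selected = []
    remaining = set(range(sim.shape[1]))
    for _ in range(budget):
        best = max(remaining,
                   key=lambda c: facility_location_gain(sim, selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected

# Illustrative data: 50 unlabeled points with unit-norm features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 8))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
sim = feats @ feats.T          # cosine similarity matrix
batch = greedy_select(sim, budget=5)
```

The greedy loop is the generic machinery; swapping in a different submodular (information) measure only changes the gain function.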
comments and criticism
We would like to thank the reviewers for their comments and helpful suggestions. We respond below to the reviewers' comments.
- R1: "The term 'privacy model' is confusing": We will definitely think of a better term.
- R1: "More discussion of the broader impact": We will elaborate more on the potential impact of the proposed approach. We believe it can be a basis for a more flexible, more expressive framework for DP learning.
- R3: "Intuitively, the sample complexity should be comparable to that of the non-private one": We disagree with this intuition.
- R3: "I'd expect some simple but a bit more advanced model, like a Bernoulli model . . .": We also want to point out that the label-determined model does capture some realistic scenarios.
- North America > United States > Texas (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Asia > Middle East > Jordan (0.04)
An LLM-Guided Tutoring System for Social Skills Training
Guevarra, Michael, Bhattacharjee, Indronil, Das, Srijita, Wayllace, Christabel, Epp, Carrie Demmans, Taylor, Matthew E., Tay, Alan
Social skills training targets behaviors necessary for success in social interactions. However, traditional classroom training for such skills is often insufficient to teach effective communication -- one-to-one interaction in real-world scenarios is preferred to lecture-style information delivery. This paper introduces a framework that allows instructors to collaborate with large language models to dynamically design realistic scenarios for students to communicate. Our framework uses these scenarios to enable student rehearsal, provide immediate feedback, and visualize performance for both students and instructors. Unlike traditional intelligent tutoring systems, instructors can easily co-create scenarios with a large language model without technical skills. Additionally, the system generates new scenario branches in real time when existing options do not fit the student's response.
- North America > Canada > Alberta (0.16)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > New Mexico (0.05)
- North America > United States > Michigan > Wayne County > Dearborn (0.05)
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Zheng, Jingnan, Wang, Han, Zhang, An, Nguyen, Tai D., Sun, Jun, Chua, Tat-Seng
Large Language Models (LLMs) can elicit unintended and even harmful content when misaligned with human values, posing severe risks to users and society. To mitigate these risks, current evaluation benchmarks predominantly employ expert-designed contextual scenarios to assess how well LLMs align with human values. However, the labor-intensive nature of these benchmarks limits their test scope, hindering their ability to generalize to the extensive variety of open-world use cases and identify rare but crucial long-tail risks. Additionally, these static tests fail to adapt to the rapid evolution of LLMs, making it hard to evaluate timely alignment issues. To address these challenges, we propose ALI-Agent, an evaluation framework that leverages the autonomous abilities of LLM-powered agents to conduct in-depth and adaptive alignment assessments. ALI-Agent operates through two principal stages: Emulation and Refinement. During the Emulation stage, ALI-Agent automates the generation of realistic test scenarios. In the Refinement stage, it iteratively refines the scenarios to probe long-tail risks. Specifically, ALI-Agent incorporates a memory module to guide test scenario generation, a tool-using module to reduce human labor in tasks such as evaluating feedback from target LLMs, and an action module to refine tests. Extensive experiments across three aspects of human values--stereotypes, morality, and legality--demonstrate that ALI-Agent, as a general evaluation framework, effectively identifies model misalignment. Systematic analysis also validates that the generated test scenarios represent meaningful use cases, as well as integrate enhanced measures to probe long-tail risks. Our code is available at https://github.com/SophieZheng998/ALI-Agent.git
- North America > United States (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
- (3 more...)
- Health & Medicine (1.00)
- Transportation (0.94)
- Information Technology > Security & Privacy (0.93)
- Law (0.93)
Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Realistic Incomplete Data Scenarios
Fan, Qi, Zuo, Haolin, Liu, Rui, Lian, Zheng, Gao, Guanglai
Multimodal emotion recognition (MER) in practical scenarios presents a significant challenge due to the presence of incomplete data, such as missing or noisy data. Traditional methods often discard missing data or replace it with a zero vector, neglecting the availability issue of noisy data. Consequently, these approaches are not fully applicable to realistic scenarios, where both missing and noisy data are prevalent. To address this problem, we propose a novel noise-robust MER model, named NMER, which effectively learns robust multimodal joint representations from incomplete data containing noise. Our approach incorporates two key components. First, we introduce a noise scheduler that adjusts the type and level of noise in the training data, emulating the characteristics of incomplete data in realistic scenarios. Second, we employ a Variational AutoEncoder (VAE)-based NMER model to generate robust multimodal joint representations from the noisy data, leveraging the modality invariant feature. The experimental results on the benchmark dataset IEMOCAP indicate that the proposed NMER outperforms state-of-the-art MER systems. The ablation results also confirm the effectiveness of the VAE structure. We release our code at https://github.com/WooyoohL/Noise-robust_MER.
- Asia > South Korea > Incheon > Incheon (0.04)
- Asia > Mongolia (0.04)
- Asia > China > Inner Mongolia > Hohhot (0.04)
- (2 more...)
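The NMER abstract above centers on a "noise scheduler" that corrupts training data to emulate missing or noisy modalities. A hedged sketch of that training-time augmentation, with the function name, linear schedule, and specific noise choices being illustrative assumptions rather than the authors' exact implementation:

```python
import numpy as np

def noise_schedule(features, step, total_steps, rng):
    """Corrupt one randomly chosen modality; the noise level ramps up
    linearly over training (a simple illustrative schedule)."""
    level = step / total_steps          # 0 -> 1 over training
    noisy = {m: f.copy() for m, f in features.items()}
    target = rng.choice(list(features))
    if rng.random() < 0.5:
        # additive Gaussian noise on the chosen modality
        noisy[target] += rng.normal(scale=level, size=noisy[target].shape)
    else:
        # simulate a missing modality by zero-masking a fraction of frames
        mask = rng.random(noisy[target].shape[0]) < level
        noisy[target][mask] = 0.0
    return noisy

# Illustrative per-modality feature sequences (20 frames each).
rng = np.random.default_rng(0)
feats = {"audio": rng.normal(size=(20, 64)),
         "text": rng.normal(size=(20, 128)),
         "video": rng.normal(size=(20, 32))}
corrupted = noise_schedule(feats, step=500, total_steps=1000, rng=rng)
```

A downstream model (the VAE-based joint-representation learner in the paper) would then be trained on such corrupted inputs while reconstructing or aligning to the clean ones.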
µTransfer: A technique for hyperparameter tuning of enormous neural networks - Microsoft Research
Great scientific achievements cannot be made by trial and error alone. Every launch in the space program is underpinned by centuries of fundamental research in aerodynamics, propulsion, and celestial bodies. In the same way, when it comes to building large-scale AI systems, fundamental research forms the theoretical insights that drastically reduce the amount of trial and error necessary and can prove very cost-effective. In this post, we relay how our fundamental research enabled us, for the first time, to tune enormous neural networks that are too expensive to train more than once. We achieved this by showing that a particular parameterization preserves optimal hyperparameters across different model sizes. This is the µ-Parametrization (or µP, pronounced "myu-P") that we introduced in a previous paper, where we showed that it uniquely enables maximal feature learning in the infinite-width limit.
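The post above says that a particular parameterization (µP) preserves optimal hyperparameters across model widths, so a base model can be tuned cheaply and the hyperparameters transferred to a wide model. A simplified sketch of that scaling idea, assuming the common rule of thumb that hidden-layer learning rates and output-layer initialization shrink with width (the exact per-layer rules in the paper are more nuanced):

```python
def mup_scaled_hparams(base_lr, base_width, width):
    """Rescale hyperparameters tuned at `base_width` for a model of
    `width`. Simplified illustration of muP-style width scaling."""
    ratio = width / base_width
    return {
        "hidden_lr": base_lr / ratio,        # hidden weights: lr ~ 1/width
        "output_init_std": 1.0 / width,      # output layer: smaller init
        "input_lr": base_lr,                 # input layer: unchanged
    }

# Tune once at width 128, then transfer to width 1024 without re-tuning.
small = mup_scaled_hparams(base_lr=0.1, base_width=128, width=128)
large = mup_scaled_hparams(base_lr=0.1, base_width=128, width=1024)
```

The point of the technique is that `base_lr` found by sweeping the small model stays near-optimal for the large one after this rescaling, so the expensive model is trained only once.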
Continuous Coordination As a Realistic Scenario for Lifelong Learning
Nekoei, Hadi, Badrinaaraayanan, Akilesh, Courville, Aaron, Chandar, Sarath
Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training, which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works.
- Instructional Material (0.57)
- Research Report (0.50)
- Overview (0.46)
- Education > Educational Setting > Continuing Education (0.83)
- Leisure & Entertainment > Games (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)